[Python] Best strategy for dealing with incomplete lines of data from a file.
Posted by adoran on Stack Overflow
Published on 2010-06-16T14:12:13Z
Indexed on 2010/06/16 14:22 UTC
I use the following block of code to read lines out of a file 'f' into a nested list:
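t = []       # first column of each row
strmat = []  # remaining columns of each row, still as strings
for line in f:
    # Strip only the newline: a bare rstrip() would also eat trailing tabs,
    # silently dropping empty fields at the end of an incomplete row.
    fields = line.rstrip('\n').split('\t')
    t.append(fields[0])
    strmat.append(fields[1:])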
Sometimes, however, the data is incomplete and a row may look like this:
['955.159', '62.8168', '', '', '', '', '', '', '', '', '', '', '', '', '', '29', '30', '0', '0']
This puts a spanner in the works: when I convert the list to an array I would like the values to be cast to floats implicitly, but the empty fields '' cause it to become an array of strings instead (dtype: S12).
I could add an 'if' statement that converts all empty fields to NULL (0 would be wrong in this instance), but I am unsure whether this is the best approach.
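A minimal sketch of that 'if' idea, assuming NaN is an acceptable stand-in for NULL. The helper name `to_float_or_nan` and the shortened sample row are hypothetical and not part of my original code:

```python
import math

def to_float_or_nan(field):
    """Convert one text field to a float; empty or blank fields become NaN."""
    field = field.strip()
    if field == '':
        return float('nan')  # marks "missing" without pretending it is 0
    return float(field)

# Shortened version of the incomplete row shown above (illustrative only).
row = ['955.159', '62.8168', '', '', '29', '30', '0', '0']
values = [to_float_or_nan(x) for x in row]

# Count how many fields were missing in this row.
n_missing = sum(math.isnan(v) for v in values)
```

Since every element is then a float, a list of such rows should convert to a float array rather than a string one, and NaN keeps a missing value distinguishable from a genuine 0.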
- Is this the best strategy for dealing with incomplete data?
- Should I clean the data as I read the stream, or do it post-hoc?
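To make the two options in the second question concrete, here is a rough sketch of both, again assuming NaN is acceptable for missing values. The `parse` helper, the `io.StringIO` stand-in for 'f', and the sample data are all made up for illustration:

```python
import io

def parse(field):
    # Hypothetical helper: empty or blank field -> NaN, otherwise float.
    return float(field) if field.strip() else float('nan')

# Stand-in for the real file 'f' (made-up sample data).
f = io.StringIO("955.159\t62.8168\t\t\t29\n956.000\t61.5\t3\t\t30\n")

# Option 1: convert while reading the stream.
t, mat = [], []
for line in f:
    fields = line.rstrip('\n').split('\t')
    t.append(parse(fields[0]))
    mat.append([parse(x) for x in fields[1:]])

# Option 2: keep the strings exactly as read and convert afterwards.
strmat = [['62.8168', '', '', '29'], ['61.5', '3', '', '30']]
mat_post = [[parse(x) for x in row] for row in strmat]
```

Both routes end with the same numbers. Converting in the loop avoids a second pass and an intermediate matrix of strings, while the post-hoc route keeps the raw text around in case the missing fields need to be inspected later.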
© Stack Overflow or respective owner